Abstract

Many NASA planning problems are over-subscription 
problems - that is, there are a large number of possible goals 
of differing value, and the planning system must choose a 
subset that can be accomplished within the limited time and 
resources available. Examples include planning for 
telescopes like Hubble, SIRTF, and SOFIA; scheduling for 
the Deep Space Network; and planning science experiments 
for a Mars rover. Unfortunately, existing planning systems
are not designed to deal with problems like this - they expect 
a well-defined conjunctive goal and terminate in failure 
unless the entire goal is achieved. In this paper we develop 
techniques for over-subscription problems that assist a 
classical planner in choosing which goals to achieve, and the 
order in which to achieve them. These techniques use plan 
graph cost-estimation techniques to construct an orienteering 
problem, which is then used to provide heuristic advice on 
the goals and goal order that should be considered by a 
planner. 

1. Introduction 

Many NASA planning problems are over-subscription prob- 
lems — that is, there are a large number of possible goals of 
differing value, and the planning system must choose a sub- 
set that can be accomplished within the limited time and re- 
sources available. For example, space and airborne 
telescopes, such as Hubble, SIRTF, and SOFIA, receive 
many more observation requests than can be accommodated. 
As a result, only a small subset of the desirable requests can 
be accomplished during any given planning horizon. For a 
Mars rover mission, there are many science targets that the 
planetary geologists would like to visit. However, the rover 
can only visit a few such targets in any given command cycle 
because of time and energy limitations, and limitations on 
the rover’s ability to track targets.

Unfortunately, planning systems are generally not de- 
signed to deal with over-subscription problems. Most sys- 
tems expect to be given a well-defined conjunctive goal and 
attempt to synthesize a plan to achieve the entire goal. They 
are not able to consider the values of the different goals, or 
to choose an appropriate subset of the goals that can be ac- 
complished within the limited time and resources available. 

In practice, most over-subscription problems have been 
addressed by using simple “greedy” approaches. For an 


earth-observing satellite (where slewing is not possible, or 
slewing times are short) one can create a reasonable obser- 
vation schedule by considering observations in descending 
order of their importance or priority. If the observation being 
considered is still possible, it is added to the schedule; oth- 
erwise it is discarded. This approach can work reasonably 
well for problems in which the cost of achieving an objective
does not depend on the order in which the objectives are 
achieved. Unfortunately, this assumption does not hold for a 
Mars rover. The reason is that there is a significant cost in 
moving from one target to the next, and that cost depends on 
the distance and terrain between the two targets. In other 
words, the ordering of the targets has a strong influence on 
the overall cost of visiting those targets. As a result, much 
more powerful and informed search heuristics are required 
to help a planner choose the targets to visit and the order in 
which to visit them.
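The greedy procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Observation` record, its `priority` and `cost` fields, and the single scalar `budget` are hypothetical names introduced here, and each cost is assumed not to depend on the order of the observations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    name: str
    priority: int   # larger means more important (hypothetical field)
    cost: float     # order-independent resource cost (assumption of this sketch)


def greedy_schedule(observations, budget):
    """Consider observations in descending priority; keep each one that still fits."""
    schedule, remaining = [], budget
    for obs in sorted(observations, key=lambda o: o.priority, reverse=True):
        if obs.cost <= remaining:       # still possible: add it to the schedule
            schedule.append(obs.name)
            remaining -= obs.cost
        # otherwise the observation is simply discarded
    return schedule
```

Because the sketch never looks at the order in which objectives are achieved, it has no way to account for order-dependent costs such as rover traversal.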

In this paper, we develop a technique for solving over- 
subscription planning problems. The technique involves 
constructing an abstracted version of the planning problem, 
and then using the resulting solution(s) to provide heuristic 
advice to the planner on the goals and steps that should be
considered, and the order in which they should be consid- 
ered. The abstracted version of the problem is formed by 
first estimating the costs of achieving each different objective
(goal) using a plan graph. This information is then used
to construct an orienteering problem, a variant of a traveling
salesman problem. Solutions to the orienteering problem 
then provide the heuristic information needed for guiding 
the planner. 

In the next section, we discuss plan graph distance esti- 
mation techniques, and show why they alone are not ade- 
quate for guiding search in over-subscription problems. In 
Section 3 we introduce the orienteering problem, and show
how an orienteering problem coupled with plan graph dis- 
tance estimation techniques can provide a useful abstraction 
of a rover planning problem. We then show how the solution 
to this orienteering problem can provide guidance to the 
planner search process. Throughout these two sections we 
limit our attention to a simple rover planning problem, 
where the mapping between the problem and the orienteer- 
ing problem is relatively obvious. In Section 4, we general- 
ize the graph construction and solution process so that it 
applies to arbitrary planning problems. In Section 5, we
present some preliminary experimental results on the rover 
problem. Finally we discuss related work and some current 
limitations of the approach. 


2. Plan Graph Distance Estimates 

A number of recent high-performance planning systems use 
a plan graph (Blum & Furst, 1997) to compute an estimate 
of the resources and time required to achieve goals from 
states encountered in the search process (e.g. Hoffman 2002, 
2003; Do & Kambhampati 2003; Edelkamp 2003). This in- 
formation is used to select among the different alternative 
search states. 1 To see how this works, consider the simple 
rover example shown in Figure 1, in which there are three 
Figure 1: Simple rover scenario with three target locations. A
sample and an image are desired at the first location, an image at 
the second, and a sample at the third. Only certain traversal paths 
are assumed possible because of terrain and visual tracking 
limitations. Numbers indicate time required to traverse paths. 


target locations (rocks) with various paths along which the
rover can travel. (Assume these paths are precomputed using
path planning algorithms.) Both a sample and an image are
desired at Loc1, an image is desired at Loc2, and a sample at
Loc3.


Figure 2 shows an abbreviated plan graph for the first two 
levels of this simple problem. Level 1 of the graph shows 
that it is possible for the rover to reach any of the three target 
locations in only one step. Level 2 shows that any individual 
experiment can be achieved after only two steps. Thus, the 
plan graph provides an optimistic assessment of when ac- 
tions and propositions are possible. Since actions take time 
and resources, the graph can be used to compute estimates 
of the time and resources needed to achieve a goal or objective.
In the example in Figure 2, numbers next to the actions
indicate the cost of each action. With these numbers, we can 
use the graph to estimate the cost of achieving each of the 
objectives at level 2. For example, in order to have a sample 
at rock 1, we would need to move to rock 1 and then collect
the sample, giving a cost of 4+3=7. These cost calculations
can be done very rapidly in a plan graph by a simple forward 
sweep through the graph. During planner search, this kind of 
heuristic “distance measure” can be used to select between 


1. More precisely, these planning systems use this distance infor- 
mation to extract a relaxed plan for the goals, then use this relaxed 
plan as an estimate of the cost of achieving the goal from the 
search state. 




Figure 2: A portion of a simple plan graph for the rover problem. 

M x,y represents a move operation from x to y, Sa x represents a
sampling operation, and Im x a close up image operation at location
x. Numbers next to actions indicate action costs. Numbers next to 
propositions are the cost estimates for achieving those 
propositions. For simplicity, mutual exclusion relationships are 
not shown. 


different possible ways of achieving a goal. For example, if 
the goal is to have the sample at location 3, then it is better 
to go directly there rather than via either location 1 or loca- 
tion 2. If a direct path is not available, the graph would tell 
us that it is better to go via location 2 (cost 8) rather than via
location 1 (cost 9). It is this idea that provides much of the 
basis for the search guidance used by the competitive planners
in the recent planning competition (Long & Fox, 2003).
For over-subscription problems, we could use the same strat- 
egy to estimate the cost of achieving each possible goal from 
the current state, and then try to use this information to select 
the most appropriate set of goals to achieve. However, we 
must also take the value of the goals into account. Thus the 
problem of choosing the set of goals becomes a sort of bin-packing
problem in which we are trying to pack the most
value into the bins of available resources. For example, sup- 
pose that the rover is at location 1, but has only four units of 
energy available. If we construct a plan graph starting at lo- 
cation 1 and do the cost estimation, the graph will tell us that 
three goals are possible: Sample1, Image1, and Image2, with
costs of 3, 1 and 3 respectively. Solving the (trivial) bin-packing
problem, we see that it is possible to achieve either
Sample1 & Image1, or Image1 & Image2. If samples are
worth more than images, then the first option is better. Otherwise,
the second option is better.
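The trivial packing problem in this example can be checked with a small exhaustive search. The sketch below is an illustration only: the costs are the ones quoted above, while the goal values are hypothetical numbers chosen to show both outcomes, and the function name `best_goal_subset` is introduced here.

```python
from itertools import combinations


def best_goal_subset(costs, values, budget):
    """Exhaustively pick the affordable subset of goals with the highest total value.

    Costs are treated as independent of one another, which is exactly the
    assumption that the plan-graph estimates make.
    """
    goals = list(costs)
    best, best_value = (), 0
    for r in range(1, len(goals) + 1):
        for subset in combinations(goals, r):
            cost = sum(costs[g] for g in subset)
            value = sum(values[g] for g in subset)
            if cost <= budget and value > best_value:
                best, best_value = subset, value
    return set(best), best_value


COSTS = {"Sample1": 3, "Image1": 1, "Image2": 3}          # from the text
SAMPLES_WORTH_MORE = {"Sample1": 5, "Image1": 2, "Image2": 2}   # hypothetical values
IMAGES_WORTH_MORE = {"Sample1": 1, "Image1": 2, "Image2": 2}    # hypothetical values
```

With a budget of four units, the first value assignment selects Sample1 and Image1, and the second selects Image1 and Image2, matching the two options described above.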

While this approach works for this very simple example, 
it generally does not work well for the rover problem. The
reason is that the cost of moving to a target depends heavily 
on the location of the previous target. Unfortunately, the 
heuristic distance estimates derived from a planning graph 
implicitly make the assumption that the goals are indepen- 
dent. For example, if the rover starts at location 0 with eight 
units of energy, the plan graph tells us that three objectives
are possible: Sample1, Image1, and Image2, with costs 7,
5 and 5 respectively. Based on these costs we would be led
to the conclusion that only one experiment can be performed,
while in reality it is possible to achieve Sample1 & Image1,
or Image1 & Image2.

The problem is that in the plan graph in Figure 2, the cost es- 
timate for getting to location 1 assumes we are coming from 
location 0, likewise for locations 2 and 3. However, locations 
1 and 2 are close together, so the cost estimate of 4 for get- 
ting to location 2 no longer applies if we choose to go to lo- 
cation 1 first. Thus, the plan graph might allow us to pick the 
best or nearest single goal to accomplish, but it provides lit- 
tle immediate guidance if we want to visit several locations 
and achieve several goals in succession. 
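The gap between independent estimates and the true cost of a sequence can be reproduced with a few lines of arithmetic. In this sketch the move costs (4 from location 0 to either location 1 or 2, and 2 between locations 1 and 2) and the per-experiment costs are inferred from the figures quoted in the text rather than read from Figure 1, so they should be treated as assumptions; the function names are introduced here.

```python
# Move costs between locations (inferred from the quoted estimates; assumption).
MOVE = {(0, 1): 4, (0, 2): 4, (1, 2): 2, (2, 1): 2}
# goal -> (location, cost of the experiment once the rover is there)
EXPERIMENT = {"Sample1": (1, 3), "Image1": (1, 1), "Image2": (2, 1)}


def independent_estimate(goal, start=0):
    """Cost of one goal in isolation, as a plan-graph forward sweep would report it."""
    loc, cost = EXPERIMENT[goal]
    return (0 if loc == start else MOVE[(start, loc)]) + cost


def sequenced_cost(goals, start=0):
    """Actual cost of achieving the goals in the given order."""
    total, here = 0, start
    for goal in goals:
        loc, cost = EXPERIMENT[goal]
        if loc != here:
            total += MOVE[(here, loc)]   # pay for the move from the previous target
            here = loc
        total += cost
    return total
```

Summing the independent estimates for Image1 and Image2 gives 10, which exceeds the eight available units, whereas the sequenced cost of doing Image1 and then Image2 is 8.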

Several researchers have augmented plan graph cost estima- 
tion techniques to better account for interaction between ac- 
tions in the graph. In particular, Hoffman (2002, 2003), Do 
& Kambhampati (2003), Edelkamp (2003) and Gerevini 
(2003) extract a relaxed plan from the planning graph and 
use this plan to estimate cost. However, this technique is pri- 
marily aimed at accounting for action duplication in estimat- 
ing the cost of achieving a well-defined goal. For an over- 
subscription problem, this technique might improve the es- 
timates for individual goals, but it does not solve the prob- 
lem of interaction between the goals. The fundamental 
problem is that the resulting cost estimates assume that the 
goals are independent, and they are not. Thus, plan graph 
cost estimation alone does not seem to provide an adequate 
mechanism for choosing goals and goal ordering in such 
problems. 

3. The Orienteering Problem 

To overcome the difficulties mentioned above, we observe 
that there is a strong similarity between the rover problem 
and a variant of the traveling salesman problem known as an
orienteering problem (Keller, 1989). In an orienteering 
problem, we are given a set of cities, a prize for each city 
(possibly zero), and a network of roads between cities. The 
objective is for a salesman to collect as much prize money as 
possible given a fixed amount of gas. The orienteering prob- 
lem has been studied extensively in the operations research 
literature, and both exact and approximate algorithms have 
been developed for solving this problem (Keller, 1989). To 
recast the rover problem as an orienteering problem, the cit- 
ies become target sites, and the roads are paths between dif- 
ferent targets, with costs corresponding to the resources 
required for the rover to traverse the path. The prizes are the 
scientific values of the experiments at a given target site. 
However, since there can be multiple experiments possible 
at a given site, and there are time and resource costs associated
with each experiment, we need to create a separate
“city” in the graph for each experiment at a site. We then add 
directed edges from the site to the experiments at that site, 
and return edges from the experiments back to the site. The 
resulting graph for our simple rover example is shown in 
Figure 3. 

We can assign a cost of zero to the return edges from an ex- 
periment to a site. However, an edge from a target site to an 
experiment should be assigned a cost that reflects the time 
and resources required to perform that experiment. Thus, for 



Figure 3: Orienteering graph for the rover problem. Cost 

estimates for each experiment are obtained using plan-graph cost 
estimation techniques. 

the sample at target1, we label the edge from target1 to
sample1 with a cost of 3, corresponding to the cost of obtaining
sample1 once we are at target1. In our simple example,
each experiment is only a single step, so it is easy to come 
up with these numbers. In reality, experiments require many 
steps, and planning is required to generate these steps. We 
obtain the numbers for experiment edges in the orienteering 
problem using plan graph cost estimates, as described in the
previous section. In particular, we ignore rover location in 
the plan graph by assigning a cost of zero to all locations. We 
can then compute cost estimates for all the objectives in the 
plan graph. These estimates provide the numbers that we 
need for each experiment edge in the orienteering problem. 
We can summarize the approach as follows: 

1. Construct a plan graph and use it to estimate the time 
and resources required to perform the different experi- 
ments at each target site. (This can be accomplished for 
all sites simultaneously by assigning a cost of zero to 
all locations in the graph.) 

2. Construct an orienteering problem, like that shown in
Figure 3, using path planning to compute the edge cost 
for each move between sites, and using the estimates 
computed in Step 1 for the edge costs between a site 
node and the science experiments at that site. 

3. Solve the orienteering problem and use the resulting 
solution to guide the planning process. 

A solution to this orienteering problem suggests which sites 
the rover should visit, the order in which to visit those sites, 
and which experiments should be performed at those sites. 
This information can then be used as heuristic guidance for 
a planner. 
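As a rough illustration of Steps 2 and 3, the sketch below encodes a fragment of the rover graph (locations 0 to 2 only) and solves it by exhaustive search. Several things here are assumptions rather than details from the paper: the move and experiment costs are inferred from the figures quoted in Section 2, the rewards are hypothetical, and the exhaustive depth-first search stands in for the exact and approximate algorithms from the orienteering literature. It relies on every edge other than a return edge having positive cost.

```python
# Site nodes, experiment nodes, and directed edges with costs.
EDGES = {
    "Loc0": {"Loc1": 4, "Loc2": 4},
    "Loc1": {"Loc0": 4, "Loc2": 2, "Sample1": 3, "Image1": 1},
    "Loc2": {"Loc0": 4, "Loc1": 2, "Image2": 1},
    "Sample1": {"Loc1": 0},   # zero-cost return edges back to the site
    "Image1": {"Loc1": 0},
    "Image2": {"Loc2": 0},
}
REWARD = {"Sample1": 3, "Image1": 2, "Image2": 2}   # hypothetical values


def solve_orienteering(edges, reward, start, budget):
    """Exhaustive depth-first search for the highest-reward walk within budget.

    Returns (total reward, path). Only suitable for very small graphs.
    """
    best = (reward.get(start, 0), [start])

    def visit(node, remaining, seen, gained, path):
        nonlocal best
        if gained > best[0]:
            best = (gained, path)
        for nxt, cost in edges.get(node, {}).items():
            if cost > remaining:
                continue                      # cannot afford this edge
            if nxt in seen and reward.get(nxt, 0) > 0:
                continue                      # prize already collected here
            bonus = 0 if nxt in seen else reward.get(nxt, 0)
            visit(nxt, remaining - cost, seen | {nxt}, gained + bonus, path + [nxt])

    visit(start, budget, frozenset([start]), reward.get(start, 0), [start])
    return best
```

Calling `solve_orienteering(EDGES, REWARD, "Loc0", 8)` yields a path that visits location 1 and performs both Sample1 and Image1, which is the kind of site order and experiment selection that would be passed to the planner as advice.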

Searching for Plans

It might appear that solving the orienteering problem pro- 
vides an exact solution to the rover’s planning problem, and 
that no additional planning would therefore be necessary. 
While this is true in our very simple example, it is not true in
general. There are several reasons for this: 

• Some of the steps for an individual experiment may 
interfere with each other. As a result, the experiment 
cost estimates obtained from the plan graph may be 
inaccurate. 
• Different experiments can share steps or can interfere
with each other. For example, obtaining a sample
and taking a close up image both involve
deploying the arm, so it is possible to share this step 
if both experiments are performed. This interaction 
is not modeled by our abstracted orienteering prob- 
lem, which implicitly assumes that the only interac- 
tion between experiments results from the location 
of the rover. 

• There may be time constraints on certain experi- 
ments (perhaps due to illumination constraints), or 
there may be required events (like communication) 
that must occur at specific times. These constraints 
are not reflected in the orienteering problem, so the 
resulting solution may be flawed. 

As a result, the orienteering problem is only an abstraction 
of the real planning problem, and the solution to the orien- 
teering problem may not prove to be a solution to the actual 
planning problem. 

Suppose that we have a solution to the orienteering prob- 
lem, and use it as heuristic advice to a planner to suggest the 
goals and the order in which to achieve those goals. When 
detailed planning is performed, the resulting plan could turn
out to be much better or worse than that predicted by the 
heuristic estimates. In either case, it may be desirable to con- 
tinue searching for a better plan. One possible approach is to 
search for another solution to the orienteering problem and 
try this as heuristic advice. Many of the algorithms devel- 
oped for solving the orienteering problem (Keller 1989) are 
based on either local search, or branch and bound, and can 
therefore be adapted to provide a stream of solutions to the 
problem. A more ambitious possibility is to update the edge 
costs for the orienteering problem to reflect the actual values 
found in planning. One could then solve the orienteering 
problem again, and use the updated solution to continue 
guiding the planning process. 

4. Generalizing to Multiple Interactions 

Thus far, we have assumed that the only strong source of in- 
teraction between experiments is the location of the rover. 
This assumption allowed us to construct an orienteering 
problem in which the “cities” correspond to locations and 
experiments. While this is a far better model than that provided
using only plan graph cost estimation techniques, it
may not provide sufficient guidance for some problems. For 
example, suppose that a rover instrument takes significant 
warm up time, but once it is warm, it can be kept warm for 
additional experiments with little additional cost. In this
case, there may be considerable advantage to performing a 
sequence of experiments at different sites using that instrument.
This violates our assumption that “location” is the
only rover attribute for which there is strong interaction be- 
tween experiments. In order to fix this problem, we need to 
consider instrument-status along with rover location in our
creation of the orienteering problem. Specifically, the cities 
in the orienteering problem now become location/instru- 
ment-status pairs. In effect, we are now solving the orien- 


teering problem on a projection of the state space for the 
rover - a projection onto the two predicates location and in- 
strument-status. 

For our simple rover problem suppose that the instrument 
is required for the two imaging operations, but not for col- 
lecting the samples. The graph would consist of two copies 
of the orienteering graph from Figure 3, one for the instru- 
ment off, and the other for the instrument on. The two graphs 
would be cross connected by the operations of turning the 
instrument on or off, as shown in Figure 4. Because the im- 



Figure 4: Orienteering problem for cross product of location and 
instrument status. 


aging operations require that the instrument be on, these two 
objectives only appear in the bottom half of the graph, where 
the instrument is on. However, the sampling operations
don’t rely on the instrument, so they appear in both the top
and bottom halves of the graph. 

This brings up an interesting issue: the graph in Figure 4 
contains two copies of some of the objectives, Sample1 and
Sample3. As a result, it is possible for the solution algorithm
to collect the reward for such a goal twice, by first visiting it in
the upper part of the graph, then transitioning to the lower
part (or vice versa). To fix this problem we add mutual exclusion
(mutex) edges between all pairs of identical objectives
appearing in the graph. We then modify the solution
algorithm for the orienteering problem to respect those mutex
constraints. This turns out to be fairly simple; the solution
algorithm already keeps track of which cities have been
visited so that it does not collect rewards twice when return- 
ing to a city. All that is necessary is that when a city is visited 
for the first time, we also add any mutex cities to the set of 
visited cities. This prevents the collection of any reward 
when visiting those cities. 
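The bookkeeping change described here is small enough to sketch directly. The function and argument names below are hypothetical; `mutex` is assumed to map each city to the set of cities that duplicate the same objective.

```python
def collect_reward(city, visited, reward, mutex):
    """Return the reward gained on entering `city`, updating `visited` in place.

    On a first visit, the city's mutex twins are marked as visited too, so a
    later visit to a duplicate of the same objective collects nothing.
    """
    if city in visited:
        return 0
    visited.add(city)
    visited.update(mutex.get(city, ()))   # duplicates of the same objective
    return reward.get(city, 0)
```

For example, after collecting the reward for the copy of a sample goal in the instrument-off half of the graph, a visit to its copy in the instrument-on half returns 0.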
For our simple rover problem, the structure of the graph
in Figure 4 seems fairly obvious. However, if we wish to apply
this technique more broadly, we need to be able to construct
this graph automatically. This turns out to involve
several steps. 

First, assume that we have the set O of predicates that will 
form the states in our orienteering problem. What we want 
to construct is the projection of the problem state space onto 
that set of predicates. However, this is generally intractable, 
since it requires that we enumerate the entire state space, 
project the states onto the desired predicates (get rid of the 
other predicates), and then combine identical projected 
states. Instead, we can construct an optimistic approxima- 
tion of the projected state space by starting with a projection 
of the initial state, applying all applicable actions to this
state, and projecting the resulting states. We repeat this pro- 
cess until no new projected states are found. This algorithm 
is summarized in Figure 5. 

1. Let I = the initial state projected onto the predicates O

2. Let States={I}, Nodes={I}, Edges={ }

3. While States is non-empty

   a. Let s = Pop(States)

   b. For each action a applicable in s:

      Let s’ be the projection onto O of the result of
      applying a to s

      Unless s’ is in Nodes, add s’ to Nodes and to States.

      Unless s’=s, add the edge <s,s’> to Edges

Figure 5: Projected state space construction

For our example, the result of this process is the graph in 
Figure 4, but without the goal (experiment) nodes and edges. 
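The construction in Figure 5 can be sketched as follows. The representation is an assumption of this sketch, not something specified in the paper: propositions are `(predicate, value)` tuples, actions are STRIPS-style records with precondition, add, and delete sets, and "applicable" is read optimistically, checking only those preconditions that mention the projected predicates.

```python
from collections import namedtuple

# Hypothetical STRIPS-style action record; pre/add/delete are frozensets of
# (predicate, value) propositions.
Action = namedtuple("Action", "name pre add delete")


def project(props, predicates):
    """Keep only the propositions whose predicate is in the chosen set O."""
    return frozenset(p for p in props if p[0] in predicates)


def projected_state_space(initial, actions, predicates):
    """Build the nodes and edges of the optimistic projected state space."""
    start = project(initial, predicates)
    frontier, nodes, edges = [start], {start}, set()
    while frontier:
        s = frontier.pop()
        for a in actions:
            # Optimistic applicability: preconditions outside O are ignored.
            if not project(a.pre, predicates) <= s:
                continue
            s2 = project((s - a.delete) | a.add, predicates)
            if s2 not in nodes:
                nodes.add(s2)
                frontier.append(s2)     # newly found states must be expanded too
            if s2 != s:
                edges.add((s, s2))
    return nodes, edges
```

For a rover with three mutually reachable locations and an instrument that can be switched on or off, projecting onto the location and instrument-status predicates gives six state nodes, connected by move edges within each half of the graph and by on/off edges between the halves.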

Next, we need to add the goal (experiment) nodes and 
edges to the graph. To do this we need to figure out which 
goal (experiment) nodes connect to which projected states in 
the orienteering graph. To determine this, we construct a re- 
laxed plan for each goal in the plan graph. However, in doing 
this construction, we assume that all propositions in the ori- 
enteering graph are available in the initial conditions of the 
plan graph. In other words, in constructing the relaxed plan, 
we stop the backward search on a proposition if it has cost 
zero. The resulting relaxed plan will rely on a (possibly emp- 
ty) set of “initial conditions” or zero-cost propositions be- 
longing to the orienteering graph. This set of propositions 
corresponds to one or more of the states in the orienteering 
graph. As a result, we add a copy of the goal to the graph for 
each such state, and connect it to that state. In our rover example,
the relaxed plan for Image1 relies on the initial conditions
Loc1 and Instrument-on. As a result only one copy of
the goal is added to the orienteering graph and is connected
to the state {Loc1, Instrument-on}. In contrast, the relaxed
plan for Sample1 relies only on Loc1 so two copies are added,
one for the projected state {Loc1, Instrument-off} and
one for the projected state {Loc1, Instrument-on}. Similar
arguments apply to Image2 and Sample3.


An edge from a state node to a goal node is assigned a 
cost equal to the cost of the relaxed plan for that goal. Thus, 
for our example, the edges from the state node {Loc1, Instrument-on}
to the goals Image1 and Sample1 are assigned the costs of
their respective relaxed plans (1 and 3). Finally, we mark any
duplicate pairs of goals in the graph as mutex. This algo- 
rithm is summarized in Figure 6. 

1. Construct a projected state space for the predicates in O

2. Mark all state space propositions as having zero cost in
   the plan graph

3. For each goal g:

   a. Construct a relaxed plan p for g (halting at zero cost
      propositions in the plan graph)

   b. Let f be the initial or foundation propositions for p

   c. For each state s_i in the projected state space consistent
      with f, add a node g_i to the graph with reward
      equal to that of the goal g.

   d. Add an edge from s_i to g_i with cost equal to the cost
      of the relaxed plan p. Add a return edge from g_i to
      s_i with cost 0.

   e. Add a mutex edge between all nodes g_i.

Figure 6: Orienteering graph construction
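Steps 3c to 3e can be sketched as below, assuming the relaxed plans have already been extracted. The input format is hypothetical: each goal maps to a `(reward, relaxed_plan_cost, foundation)` triple, states and foundations are frozensets of propositions, and a state is taken to be consistent with a foundation when it contains every foundation proposition.

```python
from itertools import combinations


def add_goal_nodes(states, goals):
    """Attach goal nodes to the projected states they are consistent with.

    Returns (edges, reward, mutex): edges maps (from_node, to_node) to cost,
    reward maps each goal node to its value, and mutex is a set of unordered
    pairs of goal nodes that are copies of the same goal.
    """
    edges, reward, mutex = {}, {}, set()
    for name, (value, plan_cost, foundation) in goals.items():
        copies = []
        for s in states:
            if foundation <= s:             # state consistent with f
                g = (name, s)               # one copy of the goal per state
                reward[g] = value
                edges[(s, g)] = plan_cost   # cost of the relaxed plan
                edges[(g, s)] = 0           # zero-cost return edge
                copies.append(g)
        mutex.update(frozenset(pair) for pair in combinations(copies, 2))
    return edges, reward, mutex
```

Applied to the two projected states at Loc1, a goal whose foundation is {Loc1, Instrument-on} receives one copy, while a goal whose foundation is only {Loc1} receives two copies joined by a mutex pair.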

Identifying Strong Interactions 

For the rover problem, it seems fairly clear which at- 
tributes should be treated in the orienteering graph and 
which can be estimated using planning graph cost estimation.
However, if we wish to apply these techniques to general
planning problems, we need some way of automatically
deciding which attributes belong in the orienteering graph. 

To do this, we can perform a kind of sensitivity analysis on 
the plan graph to find those attributes that have significant 
impact on the cost of achieving each goal. For example, in 
the plan graph shown in Figure 2, consider the relaxed plan 
for each one of the objectives. We note that all of these plans 
affect the location of the rover and leave it in a state other 
than the initial state. We therefore change the location in the 
initial conditions of the plan graph to see what impact this 
has on the cost estimates for different objectives. For the 
goal Sample1, the estimated cost varies from 5 to infinity,
depending on the initial location. As a result, location seems
like a good candidate for treatment in the orienteering graph.
Similarly, if the status of an instrument has a significant impact
on the cost of achieving some of the objectives, it too
would be a good candidate for the orienteering graph. Using
this technique, we can automatically identify those attributes 
for which the goals strongly interact. 

5. Implementation and Experiments 

Currently, we have only a partial implementation of the tech- 
niques presented in the previous sections. In particular, our 
implementation does not yet automatically decide which 
predicates form the basis of the orienteering graph. Furthermore,
our automated construction of the orienteering graph
currently only handles single predicates. The implementa- 
tion consists of: 

1. Plan graph cost estimation - a simple plan graph is constructed
to quiescence for the planning problem.

2. Cost estimation - given a basis set of predicates to be 
ignored (such as location), costs are computed for the 
objectives (goals) by setting the costs for ignored pred- 
icates to zero and performing the standard forward 
sweep through the plan graph. 

3. Relaxed plan extraction - relaxed plans are extracted 
from the planning graph for each objective, assuming 
that all propositions with zero cost are available in the 
initial conditions. 

4. Orienteering problem construction - given a single 
basis predicate (such as location), an orienteering prob- 
lem is constructed corresponding to the projection of 
the state space onto that predicate. This projection is 
extracted from the propositions and set of transitions 
present in the plan graph for that predicate. 

5. Goal node addition - each objective (goal) is added to 
the orienteering graph. It is connected to the projected 
state node in the orienteering graph corresponding to 
the zero-cost proposition used in the relaxed plan for 
that goal. 

6. Orienteering problem solution - A beam search (using 
a greedy heuristic lookahead function for evaluation) is
used to find solutions to the orienteering problem. 

7. Planner guidance - the best solution to the orienteering 
problem is used to supply the goals to a POCL planner. 
The goals are fed to the planner one at a time in the 
order suggested by the solution to the orienteering 
problem. The planner can link to actions already in the 
plan structure, but cannot violate existing causal links. 
Planning terminates when resources are exhausted, or 
no remaining goals (from the solution to the orienteer- 
ing problem) can be achieved. 

We have performed some preliminary experiments with this 
system on rover problems involving 10, 25, 50, and 100 
rocks randomly distributed in a 50x50 square. Between 1 
and 3 experiments were available at each rock with experi- 
ment values chosen randomly in the range of 1 to 5. 75% of 
the n^2 paths between rocks were assumed to be traversable,
with costs equal to the distance along the path. For the large 
problems, we gave the rover sufficient resources to allow it 
to visit approximately 10% of the rocks. For smaller problems
we tried a range of resource values. We used a beam
width of 25 when searching for solutions to the orienteering 
problem. 

In all cases, construction and solution of the orienteering 
problem is very fast (0.3 seconds for the most difficult prob- 
lems). Our technique for solving the orienteering problem is 
approximate, so we are not guaranteed to find the optimal 
solution. However, by performing experiments with very 
large beam width, we believe that we are obtaining optimal 
solutions for smaller problems and solutions within a few


percent of optimal for the largest problems. Solution quality 
tends to drop off with a beam width of less than 15, and a 
beam width of greater than 50 slows down the solution pro- 
cess significantly. For these problems, a beam width of 25 
seems to provide a good compromise between solution 
speed and solution quality. 
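A beam search over partial tours might look roughly like the sketch below. It is not the implementation described above: the greedy lookahead evaluation is replaced here by a much simpler score (reward collected so far, with remaining budget as a tie-breaker), and the graph format (`edges` as a dictionary of neighbor-to-cost dictionaries, `reward` as a dictionary) is an assumption. The loop terminates provided every edge other than a return edge has positive cost.

```python
def beam_search(edges, reward, start, budget, width=25):
    """Approximate orienteering solver that keeps the `width` best partial tours.

    Each beam entry is (reward so far, remaining budget, path, visited nodes).
    Returns (best reward found, path as a list).
    """
    beam = [(reward.get(start, 0), budget, (start,), frozenset([start]))]
    best = beam[0]
    while beam:
        candidates = []
        for gained, remaining, path, seen in beam:
            for nxt, cost in edges.get(path[-1], {}).items():
                if cost > remaining:
                    continue                  # edge is not affordable
                if nxt in seen and reward.get(nxt, 0) > 0:
                    continue                  # prize already collected
                bonus = 0 if nxt in seen else reward.get(nxt, 0)
                candidates.append(
                    (gained + bonus, remaining - cost, path + (nxt,), seen | {nxt})
                )
        # Simple stand-in score: more reward first, then more budget left.
        candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
        beam = candidates[:width]
        if beam and beam[0][0] > best[0]:
            best = beam[0]
    return best[0], list(best[2])
```

The `width` parameter plays the role of the beam width discussed above: a narrower beam prunes more partial tours per level and a wider one does more work per level.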

We have not yet completed comparisons of the resulting 
plan quality against plan quality using greedy search strate- 
gies. However, our preliminary results indicate that we are 
typically getting plan quality improvements averaging from 
10% to 30% depending on the density of goals. When the 
field is rich in goals, greedy approaches tend to work reason- 
ably well. But when the goals are widely scattered, the 
choice of goals and goal order can make a large difference in 
net reward. Figure 7 shows a typical comparison of plan re- 
Figure 7: Plan quality (reward) for orienteering guidance and
greedy guidance of a planner on a moderate sized problem 
involving 25 rocks in a 50x50 area. 

ward for the orienteering approach, and a fairly sophisticat- 
ed greedy strategy as resources available to the planner 
range from 0 to 50. Eventually, as resources become plenti- 
ful, the greedy solutions catch up to the orienteering solu- 
tions. 

6. Related Work 

Few planning systems are able to solve over-subscription 
problems. Those that can are usually hand crafted for a spe- 
cific domain and deal primarily with scheduling rather than 
planning problems. Examples of this include the scheduling
systems for the Hubble space telescope (Kramer &
Giuliano, 1997), for SIRTF (Kramer, 2000), and for the 
Landsat 7 satellite (Potter & Gasch 1998). Recently Kramer 
and Smith (2003) have investigated some heuristics for re- 
tracting tasks in over-subscription scheduling problems. 
However, it is not clear that these heuristics can address the 
kind of strong interactions found in the rover problem, or 
can be easily applied to planning problems. The Aspen system
(Chien et al., 2000) does use a local search approach to
planning for over-subscription problems. However, it relies 
on simple greedy heuristics together with hand-coded do- 
main-dependent search control information. 

Markov Decision Processes (MDPs) naturally permit the
expression of multiple objectives and values for the objectives.
Policy or value iteration can then be used to find optimal
plans. However, it is tricky to prevent repeated collection of
the same reward: to do this, one must add an additional
proposition to the state for each possible goal. This doubles
the size of the state space for each possible goal, so n goals
multiply it by 2^n.
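
The growth in the MDP state space is easy to quantify. The short Python sketch below counts the states that result from adding one boolean "collected" proposition per goal. The function names are hypothetical, and treating each cell of a 50x50 grid as one base state is an assumption made only to give the numbers a scale comparable to the 25-rock problem of Figure 7.

```python
from itertools import product

def augmented_state_count(base_states: int, num_goals: int) -> int:
    """Number of states after adding one boolean 'collected' flag per goal."""
    return base_states * 2 ** num_goals

def enumerate_flags(num_goals: int):
    """Explicitly list every assignment of the per-goal flags (small cases only)."""
    return list(product([False, True], repeat=num_goals))

# Assumed scale: 50 x 50 = 2500 base states (rover positions) and 25 goals.
BASE_STATES = 50 * 50
NUM_GOALS = 25
TOTAL_STATES = augmented_state_count(BASE_STATES, NUM_GOALS)
```

Under these assumptions the 2500 base states become 2500 x 2^25 = 83,886,080,000 states, which is already far beyond what explicit policy or value iteration can enumerate.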

Many recent planning systems make use of solutions to ab- 
stracted versions of the planning problem to guide planner 
search. These approaches typically extract relaxed plans 
from a planning graph, and use these relaxed plans to pro- 
vide heuristic guidance to the planner (Do & Kambhampati, 
2003; Edelkamp, 2003; Hoffmann, 2002, 2003). The most
sophisticated of these cost estimation methods are found in
Metric-FF (Hoffmann, 2003), which incorporates continuous
variables into the planning graph, and SAPA (Do & Kamb- 
hampati, 2003), which considers time/cost tradeoffs in its 
heuristic calculations. Here we are constructing an orien- 
teering problem to serve as the relaxed planning problem, 
rather than relying solely on a plan graph. As we argued in 
Section 2, a plan graph is not an adequate model for many 
over-subscription problems. However, we do continue to 
rely on plan graph cost-estimation techniques in order to 
seed the orienteering problem. 

Long and Fox (2000) and Ben Smith have considered the
use of specialized algorithms like TSP solvers within a plan- 
ner. However, these algorithms have been considered for the 
purpose of solving subproblems encountered during plan- 
ning. Instead, we are considering the use of such algorithms 
for solving an abstracted version of the entire problem, and
using the result to provide heuristic guidance to a planner. 

7. Conclusions 

In this paper, we developed a novel technique for solving 
over-subscription planning problems. The technique in- 
volves constructing an abstracted version of the planning 
problem and then using the resulting solution(s) to provide 
heuristic advice to the planner on the goals and steps that 
should be considered, and the order in which they should be 
considered. The abstracted version of the problem is formed
by first estimating the costs of achieving each different ob- 
jective (goal) using a plan graph. This information is then 
used to construct an orienteering problem. Solutions to the 
orienteering problem then provide the heuristic information 
needed for guiding the planner. 

Although we presented the technique in the context of a rov- 
er problem, the technique applies to over-subscription prob- 
lems more generally. In particular, the orienteering graph 
approach is useful whenever the order in which objectives 
are achieved has a strong influence on the cost of achieving 
those objectives. For such problems, simple greedy search 
strategies are not likely to work well. 


The difficulty in applying this approach to general
over-subscription planning problems is in recognizing which
predicates should be part of the orienteering graph, and which
can be treated using conventional plan graph cost estimation 
methods. We presented a technique for making these deci- 
sions automatically. It performs a kind of sensitivity analysis 
in the plan graph to determine how the cost of achieving 
each objective depends on the achievement of other objec- 
tives. If the cost of one or more objectives is highly sensitive 
to conditions that are changed when achieving other
objectives, then the predicates describing those conditions are
good candidates for the orienteering graph.
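
The idea behind this sensitivity analysis can be sketched in a few lines of Python. The sketch is only a stand-in: in our approach the costs come from the plan graph, whereas here a hypothetical estimate_cost callback (instantiated by the toy function toy_cost) plays that role, and the relative-spread test with a threshold of 0.5 is an assumption chosen for illustration.

```python
from math import dist

def sensitive_predicates(objectives, values, baseline, estimate_cost, threshold=0.5):
    """Return predicates whose value changes some objective's cost substantially.

    values maps each predicate to the values it may take after other
    objectives are achieved; baseline is the reference state.
    """
    selected = []
    for pred, options in values.items():
        for obj in objectives:
            # Vary one predicate at a time, holding the rest at the baseline.
            costs = [estimate_cost(obj, {**baseline, pred: v}) for v in options]
            lo, hi = min(costs), max(costs)
            if hi > 0 and (hi - lo) / hi > threshold:
                selected.append(pred)
                break  # one highly sensitive objective is enough
    return selected

# Hypothetical rover-like domain: location matters a lot, camera state very little.
LOC = {"lander": (0.0, 0.0), "rock1": (10.0, 0.0), "rock2": (0.0, 10.0)}

def toy_cost(obj, state):
    """Toy estimate: travel distance plus fixed sampling cost plus camera warm-up."""
    travel = dist(LOC[state["at"]], LOC[obj])
    warmup = 0 if state["camera"] == "on" else 1
    return travel + 5 + warmup

OBJECTIVES = ["rock1", "rock2"]
VALUES = {"at": ["lander", "rock1", "rock2"], "camera": ["on", "off"]}
BASELINE = {"at": "lander", "camera": "off"}
```

In this toy domain, the cost of sampling rock1 ranges from 6 to about 20 depending on the rover's location but changes by only 1 with the camera state, so only the location predicate would be selected for the orienteering graph.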

In Section 5, we indicated a number of limitations in our 
current, very preliminary implementation. We are in the pro- 
cess of removing these limitations and expect to have a more 
complete working system in the next several months. There 
are, however, some deeper issues that we have not yet
addressed. As we mentioned in Section 3, time constraints on
objectives, and on required activities such as communica- 
tions, are not considered in the orienteering graph. Although 
there has been little work on solving orienteering problems 
with time constraints, it seems likely that some of the
algorithms could be adapted to deal with them. This could
further improve the accuracy of the approximate solutions, and
thereby produce better search guidance for a planner. How- 
ever, there is likely to be a cost to this increased accuracy, 
and it remains to be seen whether this additional accuracy 
will pay off. 